Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 432 | 432 |
| Missing cells (%) | 8.1% | 8.1% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
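The two derived rows in this table can be recomputed from the other rows. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is the total size divided by the number of observations. Both are assumptions that happen to match the reported figures, and the variable names are illustrative.

```python
# Figures copied from the Dataset statistics table (identical for A and B).
n_variables = 12
n_observations = 446
missing_cells = 432
total_size_kib = 45.3

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```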
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
| Dataset A | Dataset B | |
|---|---|---|
| Age has 90 (20.2%) missing values | Age has 86 (19.3%) missing values | Missing |
| Cabin has 341 (76.5%) missing values | Cabin has 344 (77.1%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 303 (67.9%) zeros | SibSp has 297 (66.6%) zeros | Zeros |
| Parch has 330 (74.0%) zeros | Parch has 346 (77.6%) zeros | Zeros |
| Fare has 10 (2.2%) zeros | Fare has 8 (1.8%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-05-07 20:16:43.433733 | 2024-05-07 20:16:47.432330 |
| Analysis finished | 2024-05-07 20:16:47.431175 | 2024-05-07 20:16:51.377516 |
| Duration | 4 seconds | 3.95 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
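The Duration row follows from the start and finish timestamps. A minimal sketch using only the Python standard library; the timestamps are copied from the table above and the dictionary layout is illustrative.

```python
from datetime import datetime

# (started, finished) timestamps copied from the Reproduction table.
runs = {
    "Dataset A": ("2024-05-07 20:16:43.433733", "2024-05-07 20:16:47.431175"),
    "Dataset B": ("2024-05-07 20:16:47.432330", "2024-05-07 20:16:51.377516"),
}

durations = {}
for name, (started, finished) in runs.items():
    delta = datetime.fromisoformat(finished) - datetime.fromisoformat(started)
    durations[name] = delta.total_seconds()
    print(f"{name}: {durations[name]:.2f} s")
```

Dataset A comes out just under 4 seconds, which the report shows rounded as "4 seconds".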
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 441.16143 | 448.84305 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 2 | 3 |
| Maximum | 888 | 890 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 2 | 3 |
| 5-th percentile | 44.25 | 38.25 |
| Q1 | 208.5 | 221.75 |
| median | 438.5 | 453.5 |
| Q3 | 671.25 | 665.25 |
| 95-th percentile | 831.75 | 847.75 |
| Maximum | 888 | 890 |
| Range | 886 | 887 |
| Interquartile range (IQR) | 462.75 | 443.5 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 258.04615 | 259.18817 |
| Coefficient of variation (CV) | 0.58492455 | 0.57745835 |
| Kurtosis | -1.250417 | -1.1734729 |
| Mean | 441.16143 | 448.84305 |
| Median Absolute Deviation (MAD) | 231.5 | 220 |
| Skewness | -0.012184406 | -0.044725792 |
| Sum | 196758 | 200184 |
| Variance | 66587.817 | 67178.506 |
| Monotonicity | Not monotonic | Not monotonic |
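Several of the PassengerId statistics are tied together by simple identities: mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum - minimum, and IQR = Q3 - Q1. The sketch below checks those identities against the reported values. It does not recompute anything from the raw data, which is not part of this report, and the helper name `consistency_checks` is hypothetical.

```python
import math

# Values copied from the PassengerId tables above.
reported = {
    "Dataset A": {"n": 446, "sum": 196758, "mean": 441.16143, "std": 258.04615,
                  "variance": 66587.817, "cv": 0.58492455, "min": 2, "max": 888,
                  "q1": 208.5, "q3": 671.25, "range": 886, "iqr": 462.75},
    "Dataset B": {"n": 446, "sum": 200184, "mean": 448.84305, "std": 259.18817,
                  "variance": 67178.506, "cv": 0.57745835, "min": 3, "max": 890,
                  "q1": 221.75, "q3": 665.25, "range": 887, "iqr": 443.5},
}


def consistency_checks(s, rel_tol=1e-6):
    """Return named checks; each is True when the reported values agree."""
    return {
        "mean == sum / n": math.isclose(s["sum"] / s["n"], s["mean"], rel_tol=rel_tol),
        "variance == std ** 2": math.isclose(s["std"] ** 2, s["variance"], rel_tol=rel_tol),
        "cv == std / mean": math.isclose(s["std"] / s["mean"], s["cv"], rel_tol=rel_tol),
        "range == max - min": s["max"] - s["min"] == s["range"],
        "iqr == q3 - q1": s["q3"] - s["q1"] == s["iqr"],
    }


results = {name: consistency_checks(s) for name, s in reported.items()}
for name, checks in results.items():
    print(name, checks)
```

The relative tolerance allows for the rounding of the printed figures.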
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 31 | 1 | 0.2% |
| 668 | 1 | 0.2% |
| 417 | 1 | 0.2% |
| 158 | 1 | 0.2% |
| 208 | 1 | 0.2% |
| 823 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| 466 | 1 | 0.2% |
| 250 | 1 | 0.2% |
| 673 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 154 | 1 | 0.2% |
| 259 | 1 | 0.2% |
| 420 | 1 | 0.2% |
| 604 | 1 | 0.2% |
| 816 | 1 | 0.2% |
| 256 | 1 | 0.2% |
| 143 | 1 | 0.2% |
| 656 | 1 | 0.2% |
| 345 | 1 | 0.2% |
| 350 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
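Minimum 10 values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 1 | 0.2% |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 9 | 1 | 0.2% |
| 11 | 1 | 0.2% |
| 13 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| 15 | 1 | 0.2% |
| 16 | 1 | 0.2% |
| 18 | 1 | 0.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 1 | 0.2% |
| 4 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| 8 | 1 | 0.2% |
| 9 | 1 | 0.2% |
| 10 | 1 | 0.2% |
| 12 | 1 | 0.2% |
| 13 | 1 | 0.2% |
| 14 | 1 | 0.2% |
| 18 | 1 | 0.2% |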
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 0 | 0 |
| 2nd row | 1 | 1 |
| 3rd row | 0 | 1 |
| 4th row | 0 | 1 |
| 5th row | 1 | 1 |
Common Values
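Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 271 | 60.8% |
| 1 | 175 | 39.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 279 | 62.6% |
| 1 | 167 | 37.4% |
Plots: Length; Common Values (Dataset A, Dataset B)
Words
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 271 | 60.8% |
| 1 | 175 | 39.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 279 | 62.6% |
| 1 | 167 | 37.4% |
Most occurring characters
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 271 | 60.8% |
| 1 | 175 | 39.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 279 | 62.6% |
| 1 | 167 | 37.4% |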
Most occurring categories, scripts, and blocks
All characters fall into a single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the Most occurring characters tables.
| Grouping | Value | Dataset A count | Dataset B count |
|---|---|---|---|
| Category | (unknown) | 446 | 446 |
| Script | (unknown) | 446 | 446 |
| Block | (unknown) | 446 | 446 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 3 |
| 2nd row | 1 | 2 |
| 3rd row | 3 | 3 |
| 4th row | 3 | 3 |
| 5th row | 3 | 3 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 242 | 54.3% |
| 1 | 113 | 25.3% |
| 2 | 91 | 20.4% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 245 | 54.9% |
| 1 | 115 | 25.8% |
| 2 | 86 | 19.3% |
Plots: Length; Common Values (Dataset A, Dataset B)
Words
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 242 | 54.3% |
| 1 | 113 | 25.3% |
| 2 | 91 | 20.4% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 245 | 54.9% |
| 1 | 115 | 25.8% |
| 2 | 86 | 19.3% |
Most occurring characters
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 242 | 54.3% |
| 1 | 113 | 25.3% |
| 2 | 91 | 20.4% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 245 | 54.9% |
| 1 | 115 | 25.8% |
| 2 | 86 | 19.3% |
Most occurring categories, scripts, and blocks
All characters fall into a single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the Most occurring characters tables.
| Grouping | Value | Dataset A count | Dataset B count |
|---|---|---|---|
| Category | (unknown) | 446 | 446 |
| Script | (unknown) | 446 | 446 |
| Block | (unknown) | 446 | 446 |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 65 | 65 |
| Median length | 47 | 48 |
| Mean length | 26.533632 | 26.647982 |
| Min length | 12 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 11834 | 11885 |
| Distinct characters | 59 | 60 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Uruchurtu, Don. Manuel E | van Billiard, Mr. Austin Blyler |
| 2nd row | Aubart, Mme. Leontine Pauline | Jacobsohn, Mrs. Sidney Samuel (Amy Frances Christy) |
| 3rd row | Laleff, Mr. Kristo | Healy, Miss. Hanora "Nora" |
| 4th row | Harknett, Miss. Alice Phoebe | Glynn, Miss. Mary Agatha |
| 5th row | Mannion, Miss. Margareth | Nakid, Miss. Maria ("Mary") |
Words
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| mr | 256 | 14.3% |
| miss | 89 | 5.0% |
| mrs | 58 | 3.2% |
| william | 38 | 2.1% |
| master | 27 | 1.5% |
| henry | 22 | 1.2% |
| john | 21 | 1.2% |
| anna | 13 | 0.7% |
| james | 10 | 0.6% |
| george | 10 | 0.6% |
| Other values (905) | 1246 | 69.6% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| mr | 265 | 14.7% |
| miss | 92 | 5.1% |
| mrs | 59 | 3.3% |
| william | 33 | 1.8% |
| john | 24 | 1.3% |
| master | 18 | 1.0% |
| henry | 14 | 0.8% |
| james | 14 | 0.8% |
| charles | 12 | 0.7% |
| thomas | 11 | 0.6% |
| Other values (908) | 1255 | 69.8% |
Most occurring characters
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 1346 | 11.4% |
| r | 963 | 8.1% |
| e | 839 | 7.1% |
| a | 816 | 6.9% |
| i | 668 | 5.6% |
| n | 661 | 5.6% |
| s | 618 | 5.2% |
| M | 559 | 4.7% |
| l | 537 | 4.5% |
| o | 479 | 4.0% |
| Other values (49) | 4348 | 36.7% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 1351 | 11.4% |
| r | 963 | 8.1% |
| e | 832 | 7.0% |
| a | 824 | 6.9% |
| i | 681 | 5.7% |
| s | 644 | 5.4% |
| n | 637 | 5.4% |
| M | 572 | 4.8% |
| l | 524 | 4.4% |
| o | 496 | 4.2% |
| Other values (50) | 4361 | 36.7% |
Most occurring categories, scripts, and blocks
All characters fall into a single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the Most occurring characters tables.
| Grouping | Value | Dataset A count | Dataset B count |
|---|---|---|---|
| Category | (unknown) | 11834 | 11885 |
| Script | (unknown) | 11834 | 11885 |
| Block | (unknown) | 11834 | 11885 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.6816143 | 4.6950673 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2088 | 2094 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | male |
| 2nd row | female | female |
| 3rd row | male | female |
| 4th row | female | female |
| 5th row | female | female |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| male | 294 | 65.9% |
| female | 152 | 34.1% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| male | 291 | 65.2% |
| female | 155 | 34.8% |
Plots: Length; Common Values (Dataset A, Dataset B)
Words
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| male | 294 | 65.9% |
| female | 152 | 34.1% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| male | 291 | 65.2% |
| female | 155 | 34.8% |
Most occurring characters
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| e | 598 | 28.6% |
| m | 446 | 21.4% |
| a | 446 | 21.4% |
| l | 446 | 21.4% |
| f | 152 | 7.3% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| e | 601 | 28.7% |
| m | 446 | 21.3% |
| a | 446 | 21.3% |
| l | 446 | 21.3% |
| f | 155 | 7.4% |
Most occurring categories, scripts, and blocks
All characters fall into a single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the Most occurring characters tables.
| Grouping | Value | Dataset A count | Dataset B count |
|---|---|---|---|
| Category | (unknown) | 2088 | 2094 |
| Script | (unknown) | 2088 | 2094 |
| Block | (unknown) | 2088 | 2094 |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 74 | 76 |
| Distinct (%) | 20.8% | 21.1% |
| Missing | 90 | 86 |
| Missing (%) | 20.2% | 19.3% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 28.671124 | 29.907889 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.42 |
| Maximum | 74 | 74 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.42 |
| 5-th percentile | 4 | 3.95 |
| Q1 | 19 | 21 |
| median | 28 | 28 |
| Q3 | 38 | 38.25 |
| 95-th percentile | 54 | 58 |
| Maximum | 74 | 74 |
| Range | 73.58 | 73.58 |
| Interquartile range (IQR) | 19 | 17.25 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.591218 | 14.704157 |
| Coefficient of variation (CV) | 0.50891687 | 0.49164811 |
| Kurtosis | 0.086597151 | 0.1637151 |
| Mean | 28.671124 | 29.907889 |
| Median Absolute Deviation (MAD) | 9 | 8 |
| Skewness | 0.30264535 | 0.35286689 |
| Sum | 10206.92 | 10766.84 |
| Variance | 212.90365 | 216.21224 |
| Monotonicity | Not monotonic | Not monotonic |
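For a column with missing values, the percentages in the Age tables do not all share one denominator. The reported figures are consistent with Missing (%) being taken over all 446 rows, while Distinct (%) and Mean are taken over the non-missing rows only. The sketch below checks that reading against the values above; the choice of denominators is an inference from the numbers, not something the report states.

```python
import math

n_rows = 446

# Values copied from the Age tables above.
age = {
    "Dataset A": {"missing": 90, "distinct": 74, "sum": 10206.92,
                  "mean": 28.671124, "missing_pct": 20.2, "distinct_pct": 20.8},
    "Dataset B": {"missing": 86, "distinct": 76, "sum": 10766.84,
                  "mean": 29.907889, "missing_pct": 19.3, "distinct_pct": 21.1},
}

checks = {}
for name, s in age.items():
    present = n_rows - s["missing"]  # rows with a recorded age
    checks[name] = {
        # Missing (%) uses all rows as the denominator.
        "missing_pct": round(100 * s["missing"] / n_rows, 1) == s["missing_pct"],
        # Distinct (%) and Mean use only the non-missing rows.
        "distinct_pct": round(100 * s["distinct"] / present, 1) == s["distinct_pct"],
        "mean": math.isclose(s["sum"] / present, s["mean"], rel_tol=1e-6),
    }
    print(name, present, checks[name])
```

The same pattern appears to hold for Cabin, where 85 distinct values over 105 non-missing rows gives the reported 81.0% for Dataset A.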
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 18 | 18 | 4.0% |
| 22 | 15 | 3.4% |
| 28 | 14 | 3.1% |
| 24 | 13 | 2.9% |
| 21 | 12 | 2.7% |
| 19 | 12 | 2.7% |
| 36 | 12 | 2.7% |
| 30 | 12 | 2.7% |
| 25 | 12 | 2.7% |
| 27 | 10 | 2.2% |
| Other values (64) | 226 | 50.7% |
| (Missing) | 90 | 20.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 28 | 15 | 3.4% |
| 22 | 15 | 3.4% |
| 18 | 14 | 3.1% |
| 24 | 14 | 3.1% |
| 25 | 13 | 2.9% |
| 29 | 12 | 2.7% |
| 21 | 12 | 2.7% |
| 36 | 11 | 2.5% |
| 30 | 11 | 2.5% |
| 27 | 11 | 2.5% |
| Other values (66) | 232 | 52.0% |
| (Missing) | 86 | 19.3% |
Minimum 10 values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0.42 | 1 | 0.2% |
| 0.67 | 1 | 0.2% |
| 0.75 | 1 | 0.2% |
| 0.83 | 2 | 0.4% |
| 0.92 | 1 | 0.2% |
| 1 | 5 | 1.1% |
| 2 | 4 | 0.9% |
| 3 | 2 | 0.4% |
| 4 | 8 | 1.8% |
| 5 | 3 | 0.7% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0.42 | 1 | 0.2% |
| 0.75 | 2 | 0.4% |
| 0.92 | 1 | 0.2% |
| 1 | 4 | 0.9% |
| 2 | 7 | 1.6% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.4% |
| 5 | 4 | 0.9% |
| 6 | 1 | 0.2% |
| 7 | 1 | 0.2% |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.55605381 | 0.59192825 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 303 | 297 |
| Zeros (%) | 67.9% | 66.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 3 | 3 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 1.1747327 | 1.1914781 |
| Coefficient of variation (CV) | 2.1126242 | 2.0128759 |
| Kurtosis | 15.708763 | 14.263495 |
| Mean | 0.55605381 | 0.59192825 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.5324715 | 3.327934 |
| Sum | 248 | 264 |
| Variance | 1.379997 | 1.4196201 |
| Monotonicity | Not monotonic | Not monotonic |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 303 | 67.9% |
| 1 | 103 | 23.1% |
| 2 | 13 | 2.9% |
| 4 | 10 | 2.2% |
| 3 | 9 | 2.0% |
| 8 | 4 | 0.9% |
| 5 | 4 | 0.9% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 297 | 66.6% |
| 1 | 100 | 22.4% |
| 2 | 20 | 4.5% |
| 3 | 12 | 2.7% |
| 4 | 9 | 2.0% |
| 8 | 4 | 0.9% |
| 5 | 4 | 0.9% |
Minimum 10 values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 303 | 67.9% |
| 1 | 103 | 23.1% |
| 2 | 13 | 2.9% |
| 3 | 9 | 2.0% |
| 4 | 10 | 2.2% |
| 5 | 4 | 0.9% |
| 8 | 4 | 0.9% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 297 | 66.6% |
| 1 | 100 | 22.4% |
| 2 | 20 | 4.5% |
| 3 | 12 | 2.7% |
| 4 | 9 | 2.0% |
| 5 | 4 | 0.9% |
| 8 | 4 | 0.9% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 6 | 6 |
| Distinct (%) | 1.3% | 1.3% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.40134529 | 0.36098655 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 5 | 5 |
| Zeros | 330 | 346 |
| Zeros (%) | 74.0% | 77.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 5 | 5 |
| Range | 5 | 5 |
| Interquartile range (IQR) | 1 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.78347998 | 0.76566952 |
| Coefficient of variation (CV) | 1.9521345 | 2.1210472 |
| Kurtosis | 6.8442499 | 7.4431605 |
| Mean | 0.40134529 | 0.36098655 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.3434017 | 2.4817129 |
| Sum | 179 | 161 |
| Variance | 0.61384088 | 0.58624981 |
| Monotonicity | Not monotonic | Not monotonic |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 330 | 74.0% |
| 1 | 65 | 14.6% |
| 2 | 45 | 10.1% |
| 3 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| 4 | 2 | 0.4% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 346 | 77.6% |
| 1 | 50 | 11.2% |
| 2 | 44 | 9.9% |
| 3 | 3 | 0.7% |
| 5 | 2 | 0.4% |
| 4 | 1 | 0.2% |
Minimum 10 values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 330 | 74.0% |
| 1 | 65 | 14.6% |
| 2 | 45 | 10.1% |
| 3 | 2 | 0.4% |
| 4 | 2 | 0.4% |
| 5 | 2 | 0.4% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 346 | 77.6% |
| 1 | 50 | 11.2% |
| 2 | 44 | 9.9% |
| 3 | 3 | 0.7% |
| 4 | 1 | 0.2% |
| 5 | 2 | 0.4% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 378 | 381 |
| Distinct (%) | 84.8% | 85.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.7869955 | 6.7331839 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3027 | 3003 |
| Distinct characters | 35 | 32 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 329 | 335 |
| Unique (%) | 73.8% | 75.1% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | PC 17601 | A/5. 851 |
| 2nd row | PC 17477 | 243847 |
| 3rd row | 349217 | 370375 |
| 4th row | W./C. 6609 | 335677 |
| 5th row | 36866 | 2653 |
Words
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| pc | 32 | 5.6% |
| c.a | 11 | 1.9% |
| ston/o | 9 | 1.6% |
| 2 | 9 | 1.6% |
| ca | 8 | 1.4% |
| sc/paris | 6 | 1.1% |
| a/5 | 5 | 0.9% |
| soton/oq | 5 | 0.9% |
| 1601 | 5 | 0.9% |
| 3101295 | 4 | 0.7% |
| Other values (398) | 475 | 83.5% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| pc | 35 | 6.2% |
| a/5 | 12 | 2.1% |
| c.a | 11 | 1.9% |
| ca | 9 | 1.6% |
| 2 | 7 | 1.2% |
| ston/o | 7 | 1.2% |
| ston/o2 | 6 | 1.1% |
| sc/paris | 5 | 0.9% |
| 347082 | 5 | 0.9% |
| 2144 | 4 | 0.7% |
| Other values (401) | 468 | 82.2% |
Most occurring characters
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 367 | 12.1% |
| 1 | 356 | 11.8% |
| 2 | 308 | 10.2% |
| 7 | 227 | 7.5% |
| 4 | 222 | 7.3% |
| 0 | 205 | 6.8% |
| 6 | 197 | 6.5% |
| 5 | 191 | 6.3% |
| 9 | 189 | 6.2% |
| 8 | 141 | 4.7% |
| Other values (25) | 624 | 20.6% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 373 | 12.4% |
| 1 | 359 | 12.0% |
| 2 | 309 | 10.3% |
| 7 | 249 | 8.3% |
| 4 | 222 | 7.4% |
| 6 | 207 | 6.9% |
| 0 | 204 | 6.8% |
| 5 | 189 | 6.3% |
| 9 | 153 | 5.1% |
| 8 | 153 | 5.1% |
| Other values (22) | 585 | 19.5% |
Most occurring categories, scripts, and blocks
All characters fall into a single (unknown) category, script, and block, so the most frequent characters per category, per script, and per block match the Most occurring characters tables.
| Grouping | Value | Dataset A count | Dataset B count |
|---|---|---|---|
| Category | (unknown) | 3027 | 3003 |
| Script | (unknown) | 3027 | 3003 |
| Block | (unknown) | 3027 | 3003 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 193 | 176 |
| Distinct (%) | 43.3% | 39.5% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 33.238704 | 34.150466 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 10 | 8 |
| Zeros (%) | 2.2% | 1.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.129175 | 7.225 |
| Q1 | 7.925 | 7.925 |
| median | 14.4542 | 15.2458 |
| Q3 | 32.875 | 31.275 |
| 95-th percentile | 120 | 120 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 24.95 | 23.35 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 48.808475 | 53.762497 |
| Coefficient of variation (CV) | 1.4684229 | 1.574283 |
| Kurtosis | 26.360298 | 31.110773 |
| Mean | 33.238704 | 34.150466 |
| Median Absolute Deviation (MAD) | 6.9459 | 7.7104 |
| Skewness | 4.1759711 | 4.7014601 |
| Sum | 14824.462 | 15231.108 |
| Variance | 2382.2673 | 2890.4061 |
| Monotonicity | Not monotonic | Not monotonic |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 7.8958 | 20 | 4.5% |
| 8.05 | 19 | 4.3% |
| 13 | 17 | 3.8% |
| 7.75 | 14 | 3.1% |
| 10.5 | 13 | 2.9% |
| 7.775 | 12 | 2.7% |
| 7.925 | 10 | 2.2% |
| 0 | 10 | 2.2% |
| 26 | 9 | 2.0% |
| 26.55 | 7 | 1.6% |
| Other values (183) | 315 | 70.6% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 13 | 24 | 5.4% |
| 7.8958 | 21 | 4.7% |
| 7.75 | 18 | 4.0% |
| 8.05 | 18 | 4.0% |
| 26 | 16 | 3.6% |
| 7.775 | 13 | 2.9% |
| 7.925 | 12 | 2.7% |
| 8.6625 | 9 | 2.0% |
| 26.55 | 8 | 1.8% |
| 0 | 8 | 1.8% |
| Other values (166) | 299 | 67.0% |
Minimum 10 values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 10 | 2.2% |
| 6.2375 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 4 | 0.9% |
| 7.0542 | 1 | 0.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 8 | 1.8% |
| 4.0125 | 1 | 0.2% |
| 6.2375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 2 | 0.4% |
| 7.0542 | 1 | 0.2% |
| 7.125 | 2 | 0.4% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 85 | 88 |
| Distinct (%) | 81.0% | 86.3% |
| Missing | 341 | 344 |
| Missing (%) | 76.5% | 77.1% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.7142857 | 3.6078431 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 390 | 368 |
| Distinct characters | 18 | 17 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 67 | 75 |
| Unique (%) | 63.8% | 73.5% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | B35 | E12 |
| 2nd row | D47 | C103 |
| 3rd row | C85 | C46 |
| 4th row | B38 | C101 |
| 5th row | C52 | C78 |
Words
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| b96 | 3 | 2.4% |
| c22 | 3 | 2.4% |
| c26 | 3 | 2.4% |
| b98 | 3 | 2.4% |
| b20 | 2 | 1.6% |
| g6 | 2 | 1.6% |
| c52 | 2 | 1.6% |
| b66 | 2 | 1.6% |
| b63 | 2 | 1.6% |
| b57 | 2 | 1.6% |
| Other values (86) | 102 | 81.0% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| f2 | 3 | 2.5% |
| b77 | 2 | 1.7% |
| c23 | 2 | 1.7% |
| c123 | 2 | 1.7% |
| c78 | 2 | 1.7% |
| d17 | 2 | 1.7% |
| b98 | 2 | 1.7% |
| c26 | 2 | 1.7% |
| c22 | 2 | 1.7% |
| b96 | 2 | 1.7% |
| Other values (89) | 97 | 82.2% |
Most occurring characters
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 43 | 11.0% |
| B | 39 | 10.0% |
| C | 38 | 9.7% |
| 3 | 34 | 8.7% |
| 6 | 33 | 8.5% |
| 1 | 25 | 6.4% |
| 7 | 23 | 5.9% |
| 5 | 22 | 5.6% |
| 8 | 21 | 5.4% |
| (space) | 21 | 5.4% |
| Other values (8) | 91 | 23.3% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| C | 42 | 11.4% |
| 2 | 41 | 11.1% |
| B | 35 | 9.5% |
| 1 | 34 | 9.2% |
| 5 | 25 | 6.8% |
| 3 | 22 | 6.0% |
| 7 | 21 | 5.7% |
| 8 | 20 | 5.4% |
| 6 | 20 | 5.4% |
| 0 | 20 | 5.4% |
| Other values (7) | 88 | 23.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 390 |
| Value | Count | Frequency (%) |
| (unknown) | 368 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 2 | 43 | |
| B | 39 | |
| C | 38 | |
| 3 | 34 | 8.7% |
| 6 | 33 | 8.5% |
| 1 | 25 | 6.4% |
| 7 | 23 | 5.9% |
| 5 | 22 | 5.6% |
| 8 | 21 | 5.4% |
| 21 | 5.4% | |
| Other values (8) | 91 |
| Value | Count | Frequency (%) |
| C | 42 | |
| 2 | 41 | |
| B | 35 | 9.5% |
| 1 | 34 | 9.2% |
| 5 | 25 | 6.8% |
| 3 | 22 | 6.0% |
| 7 | 21 | 5.7% |
| 8 | 20 | 5.4% |
| 6 | 20 | 5.4% |
| 0 | 20 | 5.4% |
| Other values (7) | 88 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 390 |
| Value | Count | Frequency (%) |
| (unknown) | 368 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 2 | 43 | 11.0% |
| B | 39 | 10.0% |
| C | 38 | 9.7% |
| 3 | 34 | 8.7% |
| 6 | 33 | 8.5% |
| 1 | 25 | 6.4% |
| 7 | 23 | 5.9% |
| 5 | 22 | 5.6% |
| 8 | 21 | 5.4% |
|  | 21 | 5.4% |
| Other values (8) | 91 | 23.3% |
| Value | Count | Frequency (%) |
| C | 42 | 11.4% |
| 2 | 41 | 11.1% |
| B | 35 | 9.5% |
| 1 | 34 | 9.2% |
| 5 | 25 | 6.8% |
| 3 | 22 | 6.0% |
| 7 | 21 | 5.7% |
| 8 | 20 | 5.4% |
| 6 | 20 | 5.4% |
| 0 | 20 | 5.4% |
| Other values (7) | 88 | 23.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 390 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 368 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 2 | 43 | 11.0% |
| B | 39 | 10.0% |
| C | 38 | 9.7% |
| 3 | 34 | 8.7% |
| 6 | 33 | 8.5% |
| 1 | 25 | 6.4% |
| 7 | 23 | 5.9% |
| 5 | 22 | 5.6% |
| 8 | 21 | 5.4% |
|  | 21 | 5.4% |
| Other values (8) | 91 | 23.3% |
| Value | Count | Frequency (%) |
| C | 42 | 11.4% |
| 2 | 41 | 11.1% |
| B | 35 | 9.5% |
| 1 | 34 | 9.2% |
| 5 | 25 | 6.8% |
| 3 | 22 | 6.0% |
| 7 | 21 | 5.7% |
| 8 | 20 | 5.4% |
| 6 | 20 | 5.4% |
| 0 | 20 | 5.4% |
| Other values (7) | 88 | 23.9% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 1 | 2 |
| Missing (%) | 0.2% | 0.4% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 445 | 444 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | C | S |
| 2nd row | C | S |
| 3rd row | S | Q |
| 4th row | S | Q |
| 5th row | Q | C |
Common Values
| Value | Count | Frequency (%) |
| S | 326 | 73.1% |
| C | 85 | 19.1% |
| Q | 34 | 7.6% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| S | 315 | 70.6% |
| C | 94 | 21.1% |
| Q | 35 | 7.8% |
| (Missing) | 2 | 0.4% |
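The Frequency (%) column can be recomputed from the Count column. The following is a minimal sketch, assuming each percentage is the count divided by the sum of all counts in the table (missing values included) and rounded to one decimal. The function name `frequency_percent` is hypothetical and is not part of ydata-profiling.

```python
def frequency_percent(counts):
    """Return each value's share of the total count, as a percentage
    rounded to one decimal place."""
    total = sum(counts.values())
    return {value: round(100 * n / total, 1) for value, n in counts.items()}


# Dataset A counts for Embarked, copied from the Common Values table.
embarked_a = {"S": 326, "C": 85, "Q": 34, "(Missing)": 1}

for value, pct in frequency_percent(embarked_a).items():
    print(f"{value}: {pct}%")
```

Under that assumption the Dataset A counts give 19.1% for C, 7.6% for Q and 0.2% for (Missing), which agrees with the values the report lists.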
Length and Common Values (Plot): charts for Dataset A and Dataset B (images not included)
| Value | Count | Frequency (%) |
| s | 326 | 73.3% |
| c | 85 | 19.1% |
| q | 34 | 7.6% |
| Value | Count | Frequency (%) |
| s | 315 | 70.9% |
| c | 94 | 21.2% |
| q | 35 | 7.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 326 | 73.3% |
| C | 85 | 19.1% |
| Q | 34 | 7.6% |
| Value | Count | Frequency (%) |
| S | 315 | 70.9% |
| C | 94 | 21.2% |
| Q | 35 | 7.9% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 444 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 326 | 73.3% |
| C | 85 | 19.1% |
| Q | 34 | 7.6% |
| Value | Count | Frequency (%) |
| S | 315 | 70.9% |
| C | 94 | 21.2% |
| Q | 35 | 7.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 444 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 326 | 73.3% |
| C | 85 | 19.1% |
| Q | 34 | 7.6% |
| Value | Count | Frequency (%) |
| S | 315 | 70.9% |
| C | 94 | 21.2% |
| Q | 35 | 7.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 444 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 326 | 73.3% |
| C | 85 | 19.1% |
| Q | 34 | 7.6% |
| Value | Count | Frequency (%) |
| S | 315 | 70.9% |
| C | 94 | 21.2% |
| Q | 35 | 7.9% |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 30 | 31 | 0 | 1 | Uruchurtu, Don. Manuel E | male | 40.0 | 0 | 0 | PC 17601 | 27.7208 | NaN | C |
| 369 | 370 | 1 | 1 | Aubart, Mme. Leontine Pauline | female | 24.0 | 0 | 0 | PC 17477 | 69.3000 | B35 | C |
| 878 | 879 | 0 | 3 | Laleff, Mr. Kristo | male | NaN | 0 | 0 | 349217 | 7.8958 | NaN | S |
| 235 | 236 | 0 | 3 | Harknett, Miss. Alice Phoebe | female | NaN | 0 | 0 | W./C. 6609 | 7.5500 | NaN | S |
| 727 | 728 | 1 | 3 | Mannion, Miss. Margareth | female | NaN | 0 | 0 | 36866 | 7.7375 | NaN | Q |
| 136 | 137 | 1 | 1 | Newsom, Miss. Helen Monypeny | female | 19.0 | 0 | 2 | 11752 | 26.2833 | D47 | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 536 | 537 | 0 | 1 | Butt, Major. Archibald Willingham | male | 45.0 | 0 | 0 | 113050 | 26.5500 | B38 | S |
| 321 | 322 | 0 | 3 | Danoff, Mr. Yoto | male | 27.0 | 0 | 0 | 349219 | 7.8958 | NaN | S |
| 526 | 527 | 1 | 2 | Ridsdale, Miss. Lucy | female | 50.0 | 0 | 0 | W./C. 14258 | 10.5000 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 153 | 154 | 0 | 3 | van Billiard, Mr. Austin Blyler | male | 40.5 | 0 | 2 | A/5. 851 | 14.5000 | NaN | S |
| 600 | 601 | 1 | 2 | Jacobsohn, Mrs. Sidney Samuel (Amy Frances Christy) | female | 24.0 | 2 | 1 | 243847 | 27.0000 | NaN | S |
| 274 | 275 | 1 | 3 | Healy, Miss. Hanora "Nora" | female | NaN | 0 | 0 | 370375 | 7.7500 | NaN | Q |
| 32 | 33 | 1 | 3 | Glynn, Miss. Mary Agatha | female | NaN | 0 | 0 | 335677 | 7.7500 | NaN | Q |
| 381 | 382 | 1 | 3 | Nakid, Miss. Maria ("Mary") | female | 1.0 | 0 | 2 | 2653 | 15.7417 | NaN | C |
| 214 | 215 | 0 | 3 | Kiernan, Mr. Philip | male | NaN | 1 | 0 | 367229 | 7.7500 | NaN | Q |
| 661 | 662 | 0 | 3 | Badt, Mr. Mohamed | male | 40.0 | 0 | 0 | 2623 | 7.2250 | NaN | C |
| 281 | 282 | 0 | 3 | Olsson, Mr. Nils Johan Goransson | male | 28.0 | 0 | 0 | 347464 | 7.8542 | NaN | S |
| 460 | 461 | 1 | 1 | Anderson, Mr. Harry | male | 48.0 | 0 | 0 | 19952 | 26.5500 | E12 | S |
| 11 | 12 | 1 | 1 | Bonnell, Miss. Elizabeth | female | 58.0 | 0 | 0 | 113783 | 26.5500 | C103 | S |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 384 | 385 | 0 | 3 | Plotcharsky, Mr. Vasil | male | NaN | 0 | 0 | 349227 | 7.8958 | NaN | S |
| 402 | 403 | 0 | 3 | Jussila, Miss. Mari Aina | female | 21.0 | 1 | 0 | 4137 | 9.8250 | NaN | S |
| 410 | 411 | 0 | 3 | Sdycoff, Mr. Todor | male | NaN | 0 | 0 | 349222 | 7.8958 | NaN | S |
| 628 | 629 | 0 | 3 | Bostandyeff, Mr. Guentcho | male | 26.0 | 0 | 0 | 349224 | 7.8958 | NaN | S |
| 646 | 647 | 0 | 3 | Cor, Mr. Liudevit | male | 19.0 | 0 | 0 | 349231 | 7.8958 | NaN | S |
| 76 | 77 | 0 | 3 | Staneff, Mr. Ivan | male | NaN | 0 | 0 | 349208 | 7.8958 | NaN | S |
| 5 | 6 | 0 | 3 | Moran, Mr. James | male | NaN | 0 | 0 | 330877 | 8.4583 | NaN | Q |
| 139 | 140 | 0 | 1 | Giglio, Mr. Victor | male | 24.0 | 0 | 0 | PC 17593 | 79.2000 | B86 | C |
| 632 | 633 | 1 | 1 | Stahelin-Maeglin, Dr. Max | male | 32.0 | 0 | 0 | 13214 | 30.5000 | B50 | C |
| 409 | 410 | 0 | 3 | Lefebre, Miss. Ida | female | NaN | 3 | 1 | 4133 | 25.4667 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 409 | 410 | 0 | 3 | Lefebre, Miss. Ida | female | NaN | 3 | 1 | 4133 | 25.4667 | NaN | S |
| 592 | 593 | 0 | 3 | Elsbury, Mr. William James | male | 47.0 | 0 | 0 | A/5 3902 | 7.2500 | NaN | S |
| 342 | 343 | 0 | 2 | Collander, Mr. Erik Gustaf | male | 28.0 | 0 | 0 | 248740 | 13.0000 | NaN | S |
| 505 | 506 | 0 | 1 | Penasco y Castellana, Mr. Victor de Satode | male | 18.0 | 1 | 0 | PC 17758 | 108.9000 | C65 | C |
| 777 | 778 | 1 | 3 | Emanuel, Miss. Virginia Ethel | female | 5.0 | 0 | 0 | 364516 | 12.4750 | NaN | S |
| 845 | 846 | 0 | 3 | Abbing, Mr. Anthony | male | 42.0 | 0 | 0 | C.A. 5547 | 7.5500 | NaN | S |
| 378 | 379 | 0 | 3 | Betros, Mr. Tannous | male | 20.0 | 0 | 0 | 2648 | 4.0125 | NaN | C |
| 377 | 378 | 0 | 1 | Widener, Mr. Harry Elkins | male | 27.0 | 0 | 2 | 113503 | 211.5000 | C82 | C |
| 188 | 189 | 0 | 3 | Bourke, Mr. John | male | 40.0 | 1 | 1 | 364849 | 15.5000 | NaN | Q |
| 887 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0 | 0 | 112053 | 30.0000 | B42 | S |
Dataset A
Dataset does not contain duplicate rows.
Dataset B
Dataset does not contain duplicate rows.